Expression, purification, and functional characterization of soluble recombinant full-length simian immunodeficiency virus (SIV) Pr55Gag

The simian immunodeficiency virus (SIV) precursor polypeptide Pr55Gag drives viral assembly and facilitates specific recognition and packaging of the SIV genomic RNA (gRNA) into viral particles. While several studies have tried to elucidate the role of SIV Pr55Gag by expressing its different components independently, studies using full-length SIV Pr55Gag have not been conducted, primarily due to the unavailability of purified and biologically active full-length SIV Pr55Gag. We successfully expressed soluble, full-length SIV Pr55Gag with a His6-tag in bacteria and purified it using affinity and gel filtration chromatography. In the process, we identified within gag a second in-frame start codon downstream of a putative Shine-Dalgarno-like sequence, resulting in an additional truncated form of Gag. Synonymously mutating this sequence allowed expression of full-length Gag in its native form. The purified Gag assembled into virus-like particles (VLPs) in vitro in the presence of nucleic acids, revealing its biological functionality. In vivo experiments also confirmed formation of functional VLPs, and quantitative reverse transcriptase PCR demonstrated efficient packaging of SIV gRNA by these VLPs. The methodology we employed yields >95% pure, biologically active, full-length SIV Pr55Gag, which should facilitate future studies to understand protein structure and the RNA-protein interactions involved in SIV gRNA packaging.


Introduction
Retroviruses are present across different species of the animal kingdom and are particularly widespread in mammals, including humans in which they can cause various ailments [1,2]. Some retroviruses cause immunodeficiency syndromes like the human, feline,

SIV Pr55Gag-His6-tagged protein expression in bacteria
A pET28b(+)-based recombinant bacterial expression plasmid containing the full-length 1.5 kb SIV Pr55 gag sequences with a C-terminal hexa-histidine tag was cloned, sequenced, and named VP77 (Fig. 1). VP77 was transformed into BL21 bacterial cells for recombinant protein expression. Following induction of Gag gene expression with 0.4 mM IPTG (isopropyl β-D-1-thiogalactopyranoside), bacterial cultures were grown at a suboptimal temperature (28 °C) to avoid aberrant assembly of viral proteins in inclusion bodies, as has been reported earlier [8]. This strategy has worked very well in our hands for the expression of full-length Gag proteins from other retroviruses [75][76][77]. Bacterial cultures were harvested at different time points (0, 2, 4, 6, and 18 h) and protein expression was examined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) (Fig. 2).
Expression of the recombinant SIV Pr55Gag-His6-tagged protein was confirmed by Coomassie Brilliant Blue staining, which showed a band of ~55 kDa appearing in a time-dependent manner (lanes 4-6, Fig. 2). BL21 cultures transformed with the empty pET28b(+) expression vector served as the negative control, and total lysates from these cultures showed no expression of the protein (lane 2, Fig. 2). The low level of protein expression observed at 0 h, prior to induction (lane 3, Fig. 2), can be attributed to the T7 promoter of the pET28b(+) vector, which has been shown to be leaky [80]. This leaky expression did not have any deleterious effect on the bacterial cells, as uninterrupted protein expression could be observed for up to 18 h after induction.

SIV Pr55Gag-His6-tagged protein expression in the soluble bacterial fraction
After establishing that the molecular clone VP77 expressed the SIV Pr55Gag-His6-tagged fusion protein, we next determined whether this protein was soluble. VP77 was transformed into BL21 cells and cultures were harvested at 0, 2, 4, 6, and 18 h following induction at the suboptimal temperature of 28 °C. The pelleted bacterial cultures were lysed with CelLytic B buffer supplemented with EDTA-free protease inhibitor, benzonase, and lysozyme (see Materials and Methods). Following lysis and high-speed centrifugation to pellet the insoluble material (cell debris and inclusion bodies), the supernatants from these cultures were analyzed by SDS-PAGE. As observed for the total bacterial lysates, the Coomassie Brilliant Blue-stained gel showed a distinct protein band of ~55 kDa corresponding to the SIV Pr55Gag-His6-tagged fusion protein (lanes 4-6, Fig. 3A). Western blotting of the soluble fractions with α-His6 and SIV α-p27 antibodies further confirmed that the protein bands at ~55 kDa were SIV Pr55Gag with an intact C-terminal His6-tag (Fig. 3B and C, respectively). The highest protein expression was observed 4 h after induction with 0.4 mM IPTG (lane 5, Fig. 3B and C). These results revealed that the SIV Pr55Gag-His6-tagged protein expressed from VP77 was present in the soluble bacterial fraction, with the highest expression observed 4 h after induction with IPTG.

Identification of an alternative internal initiation codon downstream of a potential Shine-Dalgarno-like ribosome binding site in SIV Gag
Another significant observation from the results shown in Fig. 3 was the presence of a conspicuous protein band of lower molecular weight (~44 kDa, below our target SIV Pr55Gag protein) in all the samples tested (Fig. 3B and C). A similar scenario was observed previously during the bacterial expression and purification of the MMTV Pr77Gag-His6-tagged fusion protein, where the lower molecular weight band was found to result from a second in-frame start codon [75]. Prompted by these observations, a careful study of the SIV gag nucleotide sequence revealed a second in-frame start codon (ATG) at nucleotide 1660 (351 nucleotides downstream of the native ATG) and a Shine-Dalgarno-like sequence (5' ACA GGA ACA 3') 9 nucleotides upstream of this internal ATG, which facilitated expression of the truncated protein (Fig. 4A).
Fig. 4 (partial legend). (A) [...] showing the Shine-Dalgarno-like sequence (underlined), which is 9 nucleotides upstream of the second in-frame initiation codon ATG (nucleotide 1660, in pink), and its corresponding amino acids (shown below the sequence). (B) Table listing the predicted translation initiation rates from the actual start codon (in blue) and the second in-frame start codon (in pink) for the wild type VP77, followed by potential synonymous mutations (in red) in the region containing the Shine-Dalgarno-like sequence, designed to disrupt it and inhibit translation from the second in-frame start codon. The mutant that yielded the lowest translation initiation rate from the second start codon was named VP80. (C) Graphical comparison of the predicted translation rates from the actual start codon (blue) and the second in-frame start codon (pink) in the wild type VP77 and the mutant VP80 containing the altered Shine-Dalgarno-like sequence in recombinant full-length SIV Pr55Gag. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)
UTR Designer, an online tool that predicts prokaryotic translation initiation rates (TIR), was employed to further validate our findings [81]. This tool predicted a TIR of 195,789 from the native ATG and a TIR of 81,001 from the second ATG for VP77 (Fig. 4B). Combinations of synonymous mutations in the second potential ribosome binding site (the Shine-Dalgarno-like sequence) were analyzed using UTR Designer to design VP80, a mutant in which the predicted TIR from the second ATG was reduced to a negligible level (from 81,001 to 2073) (Fig. 4B and C).
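The arithmetic behind these observations can be checked with a short sketch. It uses only the nucleotide positions, predicted TIR values, and the 58.1 kDa predicted mass reported in this article; the average residue mass of ~110 Da is a common rule of thumb and is an assumption here, so the truncated-protein mass is only a rough estimate.

```python
# Positions are in GenBank M33262 numbering, as reported in the text.
NATIVE_ATG = 1309
INTERNAL_ATG = 1660

offset_nt = INTERNAL_ATG - NATIVE_ATG      # distance between the two start codons
in_frame = offset_nt % 3 == 0              # True only if both ATGs share a frame
codons_skipped = offset_nt // 3            # residues missing from the truncated form

# Rough mass estimate. ASSUMPTION: ~0.110 kDa per residue (rule of thumb).
FULL_LENGTH_KDA = 58.1                     # predicted mass of the His6-tagged fusion
AVG_RESIDUE_KDA = 0.110
truncated_kda = FULL_LENGTH_KDA - codons_skipped * AVG_RESIDUE_KDA

# Predicted translation initiation rates (UTR Designer, arbitrary units).
tir_native_vp77 = 195_789
tir_internal_vp77 = 81_001
tir_internal_vp80 = 2_073

internal_vs_native = tir_internal_vp77 / tir_native_vp77
fold_reduction = tir_internal_vp77 / tir_internal_vp80
percent_reduction = 100 * (1 - tir_internal_vp80 / tir_internal_vp77)

print(f"offset: {offset_nt} nt, in frame: {in_frame}, codons skipped: {codons_skipped}")
print(f"estimated truncated product: ~{truncated_kda:.1f} kDa")
print(f"internal/native TIR in VP77: {internal_vs_native:.2f}")
print(f"VP80 reduces internal TIR {fold_reduction:.1f}-fold ({percent_reduction:.1f}%)")
```

Under these assumptions the estimated mass of the truncated product (about 45 kDa) is in reasonable agreement with the ~44 kDa band seen on the gels.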

IMAC (immobilized metal affinity chromatography) purification of bacterially expressed recombinant SIV Pr55Gag-His6-tagged fusion protein
Based on the above findings, the synonymously mutated SIV Pr55Gag-His6-tagged fusion protein expression vector (VP80) was constructed and tested for protein expression after transformation into BL21 cells. Protein expression was induced with 0.4 mM IPTG, and transformed cells were grown at 28 °C for 4 h. As expected, immunoblots of the soluble bacterial fraction obtained from this culture, probed individually with α-His6 and SIV α-p27 monoclonal antibodies, revealed only a distinct band of ~55 kDa, while the ~44 kDa band observed earlier was now absent (lane 2, Fig. 5B and C). This confirmed that the synonymous mutations introduced into the Shine-Dalgarno-like sequence in VP80 were effective in inhibiting translation from the second in-frame ATG. Thus, all ensuing experiments were conducted using the molecular clone VP80, which contains the synonymously mutated Shine-Dalgarno-like sequence in the SIV Pr55 gag sequences with a C-terminal hexa-histidine tag in the pET28b(+) protein expression vector.
Next, to purify the VP80-expressed SIV Pr55Gag-His6-tagged fusion protein from other proteins present in the bacterial lysate, IMAC was employed (a HisTrap column connected to the BIO-RAD NGC liquid chromatography system; see Materials and Methods for details). A Coomassie Brilliant Blue-stained gel of the fractions obtained during and after IMAC purification provides a detailed representation of this process (Fig. 5A). Following bacterial cell lysis, the soluble bacterial fraction was applied to a Ni-immobilized IMAC column under non-denaturing, high-salt conditions (1 M NaCl). This prevented protein aggregation and precipitation, thereby enhancing protein binding to the column (lane 2, Fig. 5A). The His6-tag on the protein bound to the Ni ions on the column; consequently, we observed no protein in the flow-through collected from the IMAC column or in the column washes with 25 mM imidazole buffer (lanes 3 and 4, Fig. 5A). Following the column wash, the bound protein was eluted using 250 mM imidazole elution buffer (lane 5, Fig. 5A). Subsequent column washes with 500 mM and 1 M imidazole buffers yielded negligible or no protein (lanes 6 and 7, Fig. 5A), revealing that almost all of the protein eluted with the 250 mM imidazole buffer. Next, the IMAC fraction eluted in 250 mM imidazole buffer was immunoblotted with α-His6 and SIV α-p27 monoclonal antibodies, revealing distinct bands corresponding to the SIV Pr55Gag-His6-tagged fusion protein (lane 3, Fig. 5B and C). These observations confirmed both the functional inactivation of the Shine-Dalgarno-like sequence and the removal of bacterial proteins. Although IMAC is a powerful protein capture step, the purity of the protein obtained is often somewhat limited; hence, IMAC can be combined with gel filtration chromatography to remove remaining impurities and increase protein purity to 95%-99% [82].
We have successfully isolated high purity recombinant MPMV, MMTV and FIV full-length Gag proteins employing this two-step purification methodology [75][76][77]. Thus, the IMAC-purified protein was subjected to gel filtration chromatography for further purification, as explained in the next section.

Gel filtration chromatography
To proceed with gel filtration, the SIV Pr55Gag-His6-tagged fusion protein eluted from the IMAC column was further concentrated. Towards this end, an Amicon® Ultra 15 centrifugal device (30 kDa molecular weight cut-off) was used to reduce the volume to 1 ml fractions containing ~2-4 mg/ml protein. Since gel filtration does not depend on binding, it requires small sample volumes (0.5%-4% of the total column volume) at low flow rates to achieve higher resolution [83]. Thus, the filtered and concentrated protein fraction was injected onto a Superdex 200 Increase 10/300 GL column connected to the BIO-RAD NGC liquid chromatography system under non-denaturing conditions. As with the buffers used during IMAC, a high salt concentration (1 M NaCl) was maintained in the buffer during gel filtration to prevent any aggregation or precipitation of the protein on the column. Elution of the protein as 500 μl fractions from the column was monitored by the NGC system based on absorbance at 280 nm. A sharp peak was observed corresponding to fractions 23-29, while an additional smaller peak was observed within the void volume (V0) of the column, which signifies higher molecular weight aggregates that cannot be resolved on the column (Fig. 6A).
Fractions 23-29 were subsequently analyzed by SDS-PAGE to assess their protein purity. The Coomassie Brilliant Blue-stained gel revealed clear protein bands at ~55 kDa corresponding to the size of the SIV Pr55Gag-His6-tagged fusion protein (Fig. 6B(a)). The fractions with the highest amount of clean protein (fractions 24-26) were pooled and concentrated using the Amicon® Ultra 15 centrifugal device (30 kDa molecular weight cut-off) to ~2 mg/ml. Immunoblotting with α-His6 and SIV α-p27 monoclonal antibodies demonstrated pure SIV Pr55Gag-His6-tagged fusion protein (Fig. 6B(b) and (c), respectively), thereby establishing the purity of the pooled fractions. The spectrophotometric absorbance of a protein at 260 nm compared to the value measured at 280 nm is a useful measure of the purity of an isolated protein, where an ideal 260/280 ratio of 0.6 indicates no nucleic acid contamination [84]. Spectrophotometric analysis of the pooled SIV Pr55Gag-His6-tagged fusion protein revealed an A260/A280 ratio of 0.6, indicating purity of greater than 95% (Fig. 6C).
In our hands, a 1.5-L culture of bacteria yielded ~15 mg of protein after IMAC purification, and only ~5 mg of the pure SIV Pr55Gag-His6-tagged fusion protein could be recovered after gel filtration chromatography. These results provide us with an expression and purification system to over-express full-length SIV Pr55Gag-His6-tagged fusion protein of high purity in a soluble form without the use of solubility tags, which may confound the interpretation of downstream applications such as RNA-protein interaction studies.
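For planning purposes, the reported yields can be expressed per litre of culture and as a step recovery. The following is a minimal sketch using only the approximate figures given above (1.5 L, ~15 mg, ~5 mg); the helper `culture_needed_l` and its linear-scaling behaviour are illustrative assumptions, since actual yields will vary between preparations.

```python
CULTURE_L = 1.5       # culture volume reported in the text
IMAC_MG = 15.0        # approximate protein after IMAC
FINAL_MG = 5.0        # approximate protein after gel filtration

imac_mg_per_l = IMAC_MG / CULTURE_L
final_mg_per_l = FINAL_MG / CULTURE_L
gel_filtration_recovery = FINAL_MG / IMAC_MG   # fraction retained in the second step


def culture_needed_l(target_mg: float) -> float:
    """Hypothetical helper: culture volume for a target amount of pure protein,
    ASSUMING yield scales linearly with culture volume."""
    return target_mg / final_mg_per_l


print(f"after IMAC: {imac_mg_per_l:.1f} mg/L")
print(f"after gel filtration: {final_mg_per_l:.1f} mg/L")
print(f"gel filtration step recovery: {gel_filtration_recovery:.0%}")
print(f"culture for 20 mg pure protein: ~{culture_needed_l(20):.1f} L")
```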

In vitro assembly of recombinant SIV Pr55Gag-His6-tagged fusion protein
After obtaining purified recombinant SIV Pr55Gag-His6-tagged fusion protein, the next objective was to establish its functional characteristics. Bacterially expressed and purified retroviral Gag proteins have been shown to self-assemble in vitro into virus-like particles (VLPs) in the presence of nucleic acids for most retroviruses, such as HIV-1, FIV, RSV, MPMV and MMTV [. Therefore, we tested whether the purified SIV Pr55Gag-His6-tagged fusion protein could self-assemble in vitro and form VLPs in the presence of yeast tRNA (4%, w/w). Briefly, 2 mg/ml of pure protein in high salt buffer (1 M NaCl) was mixed with 4% (w/w) yeast tRNA and dialyzed against a low salt assembly buffer (150 mM NaCl) overnight at 4 °C. The SIV Pr55Gag-His6-tagged fusion protein (2 mg/ml) without yeast tRNA, dialyzed in the same fashion, served as the negative control. Both samples were applied to carbon-coated grids, dried, and stained for observation under an electron microscope. The electron micrographs revealed VLPs with a typical spherical appearance and an electron-dense central region, approximately 80-100 nm in size (Fig. 7A-C). The size of these immature VLPs corresponded with sizes reported earlier for HIV-1 and FIV [10,77,86,88,89]. When the in vitro assembly was performed without yeast tRNA, the SIV Pr55Gag-His6-tagged fusion protein did not assemble into any VLPs (Fig. 7D-E). These observations confirm that SIV Pr55Gag requires the presence of nucleic acid to self-assemble, as has been established earlier for HIV-1 and FIV [10,12,77,86,88]. This experiment also revealed that the SIV Pr55Gag-His6-tagged fusion protein remained biologically functional following a freeze-thaw cycle, retaining its intrinsic ability to multimerize/oligomerize in vitro and assemble into VLPs.
Successful in vitro assembly of the SIV Pr55Gag-His6-tagged fusion protein into immature VLPs suggests that the absence of post-translational modifications, such as the myristoylation of the N-terminal glycine residue described for SIV Gag [90], did not adversely affect the assembly properties of bacterially expressed recombinant SIV Pr55Gag.
Earlier studies have shown that retroviral Gag or a particular domain of Gag can form VLPs in eukaryotic systems [10,12,[75][76][77]91]. For example, the MA domain of SIV Gag can form VLPs in eukaryotic cells, and this property could be attributed to the presence of a positively charged domain (residues 26-33) that is conserved in both HIV-1 and SIV. Furthermore, when SIV MA was co-expressed with an expression plasmid encoding the Env glycoproteins (gp120 and gp41), the glycoproteins could be successfully incorporated on MA VLPs [91]. Therefore, to ascertain whether introduction of the His6-tag interferes with VLP production in eukaryotic cells, we created an SIV gag eukaryotic expression plasmid, VP78 His(+), containing the native gag sequence with a His6-tag at its C-terminus and without synonymous mutation of the putative Shine-Dalgarno-like sequence upstream of the second in-frame Gag start site. The constitutive transport element (CTE) from MPMV was added to this eukaryotic expression plasmid downstream of the SIV gag sequence to enable proper export of the respective mRNA from the nucleus to the cytoplasm and its translation [92]. HEK293T cells were transfected with VP78, while cells transfected with only the SIV transfer vector MB41 and no packaging construct served as the negative control (mock). Seventy-two hours post transfection, the transfected cells were trypsinized and processed for visualization of VLP production by TEM. As shown in Fig. 8A-D, electron-dense regions could be observed at the plasma membrane from where VLPs were budding out. Such budding VLPs were absent in the negative control (Fig. 8E-F). Most of the particles observed were within the 80-100 nm size range, which is consistent with earlier reports for immature SIV Gag VLPs without the envelope [93,94]. These findings demonstrate that the C-terminal His6-tag on SIV Pr55Gag did not interfere with Gag expression or its ability to form VLPs in eukaryotic cells.

VLPs produced by SIV Pr55Gag-His6-tagged fusion protein in eukaryotic cells can package SIV unspliced, sub-genomic RNA efficiently
Next, we tested the potential of the recombinant SIV Pr55Gag-His6-tagged fusion protein to package SIV RNA. To explore this, the full-length SIV Gag eukaryotic expression plasmids, VP78 containing the His6-tag (His+) and VP79 without the His6-tag (His-), were created (Fig. 9A). This was necessary to ensure that the presence of the His6-tag did not interfere with the nucleic acid binding ability of SIV Pr55Gag during gRNA packaging in vivo, as has been suggested earlier in the case of HIV during in vitro biochemical assays [95]. A two-plasmid genetic complementation assay was developed using these plasmids, to express SIV Pr55Gag and make VLPs with and without the His6-tag, together with the SIV transfer vector MB41 [65], which serves as the source of packageable RNA, as shown in Fig. 9A.
Briefly, HEK293T cells were co-transfected individually with either the VP78 His(+) or the VP79 His(−) construct along with MB41, while cells transfected with only the SIV transfer vector MB41 and no Gag expression plasmid served as the negative control (mock). To ensure that viral particles were produced by both the His(+) and His(−) Gag expression constructs, immunoblotting was performed using SIV α-p27 monoclonal antibody on the cell lysates prepared from the transfected cultures and on the viral particles pelleted from the transfected supernatants. The immunoblots revealed similar levels of expression of SIV Pr55Gag in the cytoplasm (Fig. 9B; Panel II) and successful formation of VLPs by both the SIV Pr55Gag His(+) and His(−) constructs (Fig. 9B; Panel III).
Next, RNAs were extracted from both the transfected cultures (cytoplasmic fractions) and the VLPs harvested from the transfected cultures. The RNA preparations were DNase-treated and then subjected to PCR with MB41 vector-specific primers (OTR1650 and OTR1651) to demonstrate the absence of any contaminating plasmid DNA from the transfected cultures (data not shown). The DNase-treated RNAs were used to make cDNAs and tested for their relative packaging efficiency (RPE) into the VLPs produced by the SIV Pr55Gag His(+) and His(−) expression constructs using our validated SYBR-Green RT-qPCR assay [68]. As can be seen from Fig. 9C, both the His(+) and His(−) SIV Gag expression constructs (VP78 and VP79, respectively) could successfully package the SIV transfer vector RNA from MB41, with VP78 His(+) showing a slightly better RNA packaging efficiency than VP79 His(−), a difference that was not statistically significant (p value = 0.1121; Fig. 9C). We attribute this statistically insignificant increase in efficiency to the addition of the positively charged His6-tag at the C-terminus of SIV Pr55Gag, which may have increased the basic nature of the polyprotein and stabilized its interaction with the RNA, as has been suggested earlier for MPMV Pr78Gag and MMTV Pr77Gag [75,76]. These data confirmed that the C-terminal His6-tag on full-length recombinant SIV Pr55Gag did not affect its biological activity; the tagged protein could efficiently form VLPs with the ability to encapsidate SIV transfer vector RNA.
Fig. 9 (partial legend). [...] with α-β actin monoclonal antibody at 1:25000 dilution and SIV α-p27 monoclonal antibody at 1:100 dilution, respectively. Panel III is the immunoblot of the ultracentrifuged VLPs probed with SIV α-p27 monoclonal antibody at 1:100 dilution. (C) Relative RNA packaging efficiency of the VP78 His(+) and VP79 His(−) SIV Pr55Gag constructs, respectively, obtained by RT-qPCR. The relative RNA packaging efficiency was determined by dividing the viral RNA packaging values by the cytoplasmic expression normalized to secreted alkaline phosphatase (SEAP) expression for the respective clones. The difference in packaging efficiency between the two clones was not statistically significant. Mock contains only the transfer vector and no packaging construct; hence, the transfer vector RNA cannot be packaged into viral particles and no Gag can be detected on western blots. The uncropped western blots are provided in Supplementary Fig. 5.
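The relative packaging efficiency calculation described in the legend can be sketched as follows. This is one possible reading of that description, not the validated assay of [68]: the function name, the order of normalization, and all numerical inputs are hypothetical and chosen only to illustrate the arithmetic.

```python
def relative_packaging_efficiency(viral_rna: float,
                                  cytoplasmic_rna: float,
                                  seap: float) -> float:
    """Packaged (viral) RNA divided by SEAP-normalized cytoplasmic RNA.

    ASSUMPTION: SEAP activity is used to normalize the cytoplasmic
    expression for transfection efficiency before taking the ratio.
    """
    normalized_cytoplasmic = cytoplasmic_rna / seap
    return viral_rna / normalized_cytoplasmic


# HYPOTHETICAL inputs (arbitrary units); not data from this study.
samples = {
    "VP78 His(+)": {"viral_rna": 12.0, "cytoplasmic_rna": 5.0, "seap": 1.25},
    "VP79 His(-)": {"viral_rna": 10.0, "cytoplasmic_rna": 4.0, "seap": 1.0},
}

rpe = {name: relative_packaging_efficiency(**values)
       for name, values in samples.items()}

# Express each clone relative to the untagged construct.
reference = rpe["VP79 His(-)"]
for name, value in rpe.items():
    print(f"{name}: RPE = {value:.2f} ({value / reference:.2f}x of His(-))")
```

Whether a difference between the two clones is meaningful would still need to be assessed with an appropriate statistical test across independent replicates.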

Conclusions
We report successful cloning, expression, and purification of recombinant full-length SIV Pr55Gag with a C-terminal His6-tag using a prokaryotic expression system. The expressed protein was present in the soluble fraction without the use of any solubility tags, which are generally bulky proteins or protein domains that may interfere with downstream applications such as RNA-interaction or conformational studies. The identity of the purified protein was established using specific SIV α-p27 as well as α-His6 monoclonal antibodies. The SIV Pr55Gag-His6-tagged fusion protein was able to form VLPs in vitro, while the VLPs formed in eukaryotic cells were functionally able to encapsidate SIV transfer vector RNA despite the presence of the His6-tag. The ability to express and purify large quantities of recombinant SIV Pr55Gag should facilitate structural and functional studies, especially those aimed at understanding the intricacies of RNA-protein interactions during SIV gRNA packaging. The availability of recombinant SIV Pr55Gag should pave the way for biochemical studies, such as band-shift and footprinting assays, to learn where on the SIV gRNA Gag binds during gRNA packaging, as we and others have recently shown for other retroviruses [19,20,25,27,78,79]. Studies using full-length Gag should prevent ambiguities in identifying Gag binding sites on the gRNA that could be missed when using proteins expressed from truncated or partial Gag, such as the NC domain only. Furthermore, many functional aspects of Gag proteolytic processing, and of the conformational changes and oligomerization of Gag that occur during virus maturation, can now be investigated in depth with the availability of this biologically active and functional SIV Pr55Gag.
Since the protein obtained is >95% pure, it is well suited for applications such as drug interaction studies using peptides as pharmaceuticals, structure-function relationship studies, and the study of protein structure by cryo-electron microscopy. One limitation of our study could be the absence of post-translational modifications in the bacterially expressed recombinant SIV Pr55Gag. Native SIV Pr55Gag undergoes two major post-translational modifications, namely proteolytic cleavage of precursor Gag into its subunits after budding [96], and N-terminal myristoylation of the glycine residue, which targets Gag towards the plasma membrane for proper assembly [90]. Another possible limitation is that we did not add zinc during SIV Pr55Gag purification and/or during in vitro assembly of VLPs, which may be important for proper folding of the NC domain of Gag since it has two zinc finger binding domains. However, successful in vitro assembly of VLPs demonstrates that bacterially expressed recombinant SIV Pr55Gag can assemble properly and is thus suitable for the above-mentioned studies despite these potential caveats.

Nucleotide numbers
Nucleotide designations are based on GenBank accession number M33262 for SIVmac239 [3].

Construction of full-length SIV Pr55Gag prokaryotic expression plasmids
The complete open reading frame (ORF) of SIV full-length Gag (Pr55Gag; nucleotides 1309-2839) was chemically synthesized (Macrogen, South Korea) with flanking restriction enzyme sites, namely NcoI at the 5' end and XhoI at the 3' end. For ease of cloning, two inherent NcoI sites (CCATGG) in the Gag ORF at nucleotide positions 2623 and 2651 were modified to CCGTGG and CAATGG, respectively, to inactivate the NcoI sites while maintaining the native amino acid proline at both locations (Fig. 1C). The chemically synthesized full-length Gag ORF and the bacterial expression vector pET28b(+) were cleaved with NcoI and XhoI restriction enzymes and ligated together to generate clone VP77. Thus, VP77 expressed an in-frame recombinant fusion protein comprising full-length SIV Pr55Gag and a hexa-histidine tag at its C-terminus (Pr55Gag-His6-tag), as shown in Fig. 1, with a predicted molecular weight of 58.1 kDa calculated by the ExPASy-pI/Mw tool [97]. Preliminary expression of SIV Pr55Gag revealed the presence of a Shine-Dalgarno-like sequence (5' ACAGGAACA 3') 9 nucleotides upstream of a second in-frame start codon (ATG) located at nucleotide 1660, resulting in the expression of an N-terminally truncated protein of ~44 kDa. To inhibit the expression of this truncated protein, VP77 was further modified by introducing synonymous mutations in the region (5' GAA ACA GGA ACA ACA GAA ACT ATG 3') spanning the Shine-Dalgarno-like sequence (underlined) up to the second in-frame ATG (italics). Different combinations of mutations were introduced and the translation initiation rates from the second start codon were predicted using the UTR Designer tool [81]. The minimal set of synonymous mutations that gave the lowest translation initiation rate from the second in-frame ATG was selected for chemical synthesis. The bacterial expression plasmid pET28b(+) was used to clone this chemically synthesized DNA to create VP80 containing the modified Shine-Dalgarno-like sequence (mutated nucleotides shown in bold; 5' GAA ACC GGC ACT ACC GAA ACT ATG 3'). To ensure that SIV Pr55Gag was in-frame with an intact His6-tag, the VP77 and VP80 clones were confirmed by sequencing.
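That the VP80 changes are synonymous and disrupt the Shine-Dalgarno-like motif can be verified directly from the two sequences given above. The sketch below uses the standard genetic code; the helper names are illustrative and not part of any published tool.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def strip(seq: str) -> str:
    """Remove spaces and upper-case a sequence written in codon triplets."""
    return seq.replace(" ", "").upper()


def translate(seq: str) -> str:
    """Translate a DNA sequence in frame 1, ignoring any trailing partial codon."""
    s = strip(seq)
    return "".join(CODON_TABLE[s[i:i + 3]] for i in range(0, len(s) - len(s) % 3, 3))


VP77_REGION = "GAA ACA GGA ACA ACA GAA ACT ATG"   # wild type, from the text
VP80_REGION = "GAA ACC GGC ACT ACC GAA ACT ATG"   # mutant, from the text
SD_LIKE = "ACAGGAACA"                             # Shine-Dalgarno-like motif

wt, mut = strip(VP77_REGION), strip(VP80_REGION)
changed_positions = [i for i, (a, b) in enumerate(zip(wt, mut)) if a != b]

print("wild type peptide:", translate(VP77_REGION))
print("mutant peptide:   ", translate(VP80_REGION))
print("synonymous:", translate(VP77_REGION) == translate(VP80_REGION))
print("changed nucleotides (0-based):", changed_positions)
print("SD-like motif in VP77:", SD_LIKE in wt, "| in VP80:", SD_LIKE in mut)
```

All four substitutions fall in the third codon position, which is why the encoded peptide is unchanged while the motif is lost.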

Construction of full-length SIV Pr55Gag eukaryotic expression plasmids
Eukaryotic expression plasmids expressing full-length SIV Pr55Gag with and without the His6-tag were also created. Using the prokaryotic expression plasmid VP77 as the template, PCR amplification was performed using oligo set OTR1549 and OTR1550 to create SIV Pr55Gag with the His6-tag, and oligo set OTR1549 and OTR1551 to create SIV Pr55Gag without the His6-tag. OTR1549 (5' ccg CTC GAG GCC GCC ACC ATG GGC GTG AGA AAC TCC GTC 3') was the sense oligo containing 3 dummy nucleotides (lowercase) followed by an XhoI site (italicized) and a Kozak sequence (underlined) just upstream of the Gag initiation codon (bold). OTR1550 (5' T TCT CTC TTT GGA GGA GAC CAG CAC CAC CAC CAC CAC CAC TAG CTC GAG cgg 3') was the anti-sense oligo containing the SIV gag sequences followed by the His6-tag sequence (underlined), termination codon (bold), an XhoI site (italicized) and 3 dummy nucleotides (lowercase). OTR1551 was another anti-sense oligo (5' T CTC TTT GGA GGA GAC CAG TAG CTC GAG cgg 3') containing SIV gag sequences followed by the termination codon (bold), an XhoI site (italicized) and 3 dummy nucleotides (lowercase). The Phusion High-Fidelity PCR Kit (New England Biolabs) was used for PCR amplification with its standard PCR conditions and primer annealing at 61 °C for 30 s. The PCR-amplified products with and without the His6-tag were cleaved with XhoI restriction endonuclease and cloned into a eukaryotic expression plasmid (pcDNA3) that had already been cleaved with XhoI to generate VP78 and VP79, respectively. To facilitate efficient SIV Gag mRNA nuclear export and translation, the constitutive transport element (CTE) from MPMV [92,[98][99][100][101] was cloned immediately after the gag stop codon, as has been reported earlier [75][76][77]. The resulting clones, VP78 and VP79, were confirmed by sequencing. Details of the primer pairs used for introducing specific mutations and/or for cloning are listed in Supplementary Table 1.
The specific conditions employed during PCR cycles as well as complete details of cloning are available from the authors upon request.
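The design features described for the three oligos can be checked against the sequences as printed. The sketch below treats each oligo exactly as written in the text (5' to 3', without reverse-complementing the oligos described as anti-sense), which is an assumption about how they are presented; the motif constants are taken from the descriptions above.

```python
def strip(seq: str) -> str:
    """Remove spaces and upper-case an oligo written in triplets."""
    return seq.replace(" ", "").upper()


OTR1549 = strip("ccg CTC GAG GCC GCC ACC ATG GGC GTG AGA AAC TCC GTC")
OTR1550 = strip("T TCT CTC TTT GGA GGA GAC CAG CAC CAC CAC CAC CAC CAC TAG CTC GAG cgg")
OTR1551 = strip("T CTC TTT GGA GGA GAC CAG TAG CTC GAG cgg")

XHOI = "CTCGAG"        # XhoI recognition site
KOZAK = "GCCGCCACC"    # Kozak sequence placed before the start codon
HIS6 = "CAC" * 6       # six histidine codons
STOP = "TAG"
DUMMY = "CGG"          # 3 dummy nucleotides (upper-cased; 'ccg' on the sense oligo)

checks = {
    # Sense oligo: dummy + XhoI + Kozak + ATG at the 5' end.
    "OTR1549 has XhoI-Kozak-ATG": OTR1549.startswith("CCG" + XHOI + KOZAK + "ATG"),
    # Tagged product: gag ... His6 + stop + XhoI + dummy at the 3' end.
    "OTR1550 ends His6-stop-XhoI": OTR1550.endswith(HIS6 + STOP + XHOI + DUMMY),
    # Untagged product: gag ... stop + XhoI + dummy, with no His6 codons.
    "OTR1551 ends stop-XhoI": OTR1551.endswith(STOP + XHOI + DUMMY),
    "OTR1551 lacks His6": HIS6 not in OTR1551,
}

# The gag-specific portions of the two downstream oligos should overlap.
gag_1550 = OTR1550[: OTR1550.index(HIS6)]
gag_1551 = OTR1551[: OTR1551.index(STOP + XHOI)]
checks["gag portions agree"] = gag_1550.endswith(gag_1551)

for description, ok in checks.items():
    print(f"{description}: {ok}")
```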

Growth media and bacterial strains used for cloning and protein expression
During the course of cloning, the ligated DNAs were introduced into bacteria by transforming the DH5α strain of E. coli using the standard heat shock protocol in the presence of the required antibiotics (kanamycin, 50 μg/ml, and/or ampicillin, 100 μg/ml) in Luria-Bertani (LB) media, as described earlier [75][76][77]. For bacterial protein expression, the prokaryotic recombinant protein clones VP77 and VP80 were transformed into T7 Express (New England Biolabs), a BL21 derivative of E. coli. LB medium containing 50 μg/ml of kanamycin was used to culture a single colony following transformation, as reported earlier [75][76][77].

Large scale expression of recombinant SIV Pr55Gag-His6-tagged fusion protein in bacteria
An isolated colony from transformed BL21 cells was used to inoculate 50 ml of LB media containing the appropriate antibiotic (kanamycin 50 μg/ml) and grown overnight at 37 °C while shaking at 200 revolutions per minute (rpm). The next morning, 2-L baffled Erlenmeyer flasks containing 500 ml LB with 50 μg/ml kanamycin were inoculated with this overnight culture and grown at 28 °C until the optical density at 600 nm (OD600) reached approximately 0.6. This was followed by induction with 0.4 mM IPTG, and the culture was allowed to grow for an additional 4 h at 28 °C, as described previously [75][76][77]. At 4 h, the bacterial culture was centrifuged at 6300×g for 15 min at 4 °C and the pellets were frozen at −80 °C for monitoring protein expression and further purification.

Recombinant SIV Pr55 Gag -His 6 -tagged fusion protein purification by IMAC and size exclusion chromatography
The recombinant SIV Pr55Gag-His6-tagged fusion protein was purified as reported earlier for MPMV, MMTV, and FIV [75][76][77]. Briefly, the bacterial pellets were lysed in ice-cold CelLytic B buffer (Sigma-Aldrich, USA) supplemented with EDTA-free protease inhibitor, lysozyme, and benzonase. The lysate was centrifuged at 48,000×g for 1 h at 4 °C to collect the soluble protein fraction, which was mixed with 4X binding buffer (4.0 M NaCl, 0.2 M Tris-HCl (pH 8.0), 100 mM imidazole, 40 mM β-mercaptoethanol, 10 mM dithiothreitol (DTT), 0.4% (w/v) Tween-20) to a final concentration of 1X. The diluted lysate was filtered and applied to a HisTrap™ FF 5 ml column that was attached to the BIO-RAD NGC liquid chromatography system and had been equilibrated with 1X equilibration buffer (1 M NaCl, 50 mM Tris-HCl (pH 8.0), 25 mM imidazole, 10 mM β-mercaptoethanol, 2.5 mM DTT, 0.1% (w/v) Tween-20 and 10% (v/v) glycerol). After the lysate was applied, the column was washed with 5 column volumes of the same equilibration buffer containing 25 mM imidazole. The bound His6-tagged protein was then eluted with 5 column volumes of elution buffer, which is the equilibration buffer containing 250 mM imidazole. The column was further washed with buffers containing 500 mM and 1 M imidazole to ensure complete removal of bound protein. The eluted protein was concentrated using an Amicon® Ultra 15 (30,000 molecular weight cut-off membrane) column, and its quality was analyzed by SDS-PAGE followed by Coomassie Brilliant Blue staining and immunoblotting. To further purify the protein to homogeneity, size exclusion chromatography was employed. The concentrated protein was injected in 1 ml portions at a concentration of 2 mg/ml onto a Superdex 200 Increase 10/300 GL column attached to the BIO-RAD NGC liquid chromatography system. The column was equilibrated with 1X gel filtration buffer (50 mM Tris-HCl (pH 8.0), 1 M NaCl, 1 mM DTT) and the protein was eluted in the same buffer. The fractions corresponding to the elution peak were further evaluated by SDS-PAGE, and the cleanest fractions were pooled and concentrated into 2 mg/ml aliquots, which were flash frozen and stored at −80 °C. Purity of the pooled fractions of SIV Pr55Gag-His6-tagged fusion protein was determined by measuring the 260/280 nm absorbance ratio.
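The buffer arithmetic in this purification can be checked with a short calculation. The following is a minimal sketch, not part of the published protocol: it dilutes the 4X binding buffer composition given above to 1X and computes the volume of a 5-column-volume step on the 5 ml column. The 30 ml lysate volume and all function names are hypothetical, and the sketch assumes the lysate itself contributes none of the binding buffer components.

```python
# Composition of the 4X binding buffer as listed in the text.
# Units are encoded in the key names.
BINDING_BUFFER_4X = {
    "NaCl (M)": 4.0,
    "Tris-HCl pH 8.0 (M)": 0.2,
    "imidazole (mM)": 100.0,
    "beta-mercaptoethanol (mM)": 40.0,
    "DTT (mM)": 10.0,
    "Tween-20 (% w/v)": 0.4,
}


def dilute(stock, fold):
    """Return the concentrations obtained after diluting a stock `fold`-fold."""
    return {name: conc / fold for name, conc in stock.items()}


def stock_volume_ml(lysate_ml, fold):
    """Volume of `fold`X stock to add to the lysate so the mixture is 1X.

    Assumption: the lysate contains none of the buffer components,
    so stock / (stock + lysate) must equal 1 / fold.
    """
    return lysate_ml / (fold - 1)


def step_volume_ml(column_ml, column_volumes):
    """Buffer volume for a wash or elution step given in column volumes."""
    return column_ml * column_volumes


final_1x = dilute(BINDING_BUFFER_4X, 4)
for name, conc in final_1x.items():
    print(f"{name}: {conc:g}")

# Hypothetical 30 ml of clarified lysate needs 10 ml of 4X buffer.
buffer_to_add = stock_volume_ml(30.0, 4)
print(f"4X buffer to add: {buffer_to_add:g} ml")

# 5 column volumes on the 5 ml column.
wash_volume = step_volume_ml(5.0, 5)
print(f"Wash or elution volume: {wash_volume:g} ml")
```

The diluted values agree with the 1X equilibration buffer listed in the text, apart from the 10% (v/v) glycerol, which the text lists only for the equilibration buffer.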

In vitro assembly of SIV Pr55Gag-His6-tagged fusion protein to form VLPs
To study the ability of the SIV Pr55Gag-His6-tagged recombinant fusion protein to form VLPs, 200 μl of 2 mg/ml protein (in 1 M NaCl, 50 mM Tris-HCl (pH 8.0), 1 mM DTT) was mixed with yeast tRNA at a protein to RNA ratio of 4% (w/w). This protein-tRNA mixture was dialyzed overnight at 4 °C against a buffer containing 150 mM NaCl, 50 mM Tris (pH 8.0), and 10 mM DTT. Subsequently, 8 μl of the sample was spotted onto a carbon-coated formvar grid, stained with uranyl acetate, and examined by TEM, as described previously [75][76][77].

Visualization of SIV VLPs produced in eukaryotic cells by SIV Pr55Gag-His6-tagged fusion protein
To visualize VLPs produced by the SIV Pr55Gag-His6-tagged fusion protein using TEM, a plasmid expressing this protein was transfected into human embryonic kidney (HEK) 293T cells, which were processed as described previously [77]. Briefly, approximately 72 h post transfection, cells were harvested by trypsinization, pelleted, and fixed in Karnovsky's fixative. Ultrathin sections (95 nm) of the resin-embedded samples were stained using 1% osmium tetroxide and visualized by TEM (FEI Tecnai Biotwin Spirit G2), as described previously [77].

Eukaryotic expression of SIV Pr55Gag-His6-tagged fusion protein
Eukaryotic expression was monitored by transfecting HEK 293T cells with the pcDNA3-based SIV Pr55Gag eukaryotic expression plasmids VP78 (His+) and VP79 (His-), as described in our recent studies [75][76][77]. To isolate VLPs, the culture medium from the transfected cells was harvested ~72 h after transfection and subjected to ultracentrifugation. Isolated VLPs were then used for RNA extraction and western blotting.

Reverse transcriptase PCR (RT-PCR) and RT-qPCR to estimate relative packaging efficiency (RPE) of SIV gRNA
The efficiency with which SIV RNA was packaged into the recombinant Gag VLPs was estimated using RT-qPCR. For this purpose, a customized SYBR Green-based RT-qPCR assay was developed to measure the amount of SIV subgenomic RNA expressed from the SIV transfer vector (MB41 [65]). The assay employed two newly designed oligos (OTR1650/OTR1651; Supplementary Table 1) at 300 nM. PCR was conducted using the SYBR Green qPCR master mix from Solis BioDyne (5x Hot FirePol EvaGreen qPCR Supermix). Expression of the SIV-specific signal was quantified using the 2^−ΔΔCT method, as described previously [68], with β-actin as the endogenous control (OFM456/OTR1199; Supplementary Table 1) at a concentration of 500 nM. Briefly, the most suitable primer concentration was established by testing various primer concentrations and selecting the one that provided maximal amplification with a single, sharp peak in melt curve analysis. The samples were tested in triplicate using the Applied Biosystems 7500 ABI QuantStudio™ 7 Flex System (Applied Biosystems, USA) for 40 cycles at an annealing temperature of 60 °C. The relative packaging efficiency (RPE) of SIV gRNA by each of the SIV Pr55Gag constructs was quantified as the ratio of RNA packaged into the virus particles to its cytoplasmic expression, normalized to the secreted alkaline phosphatase (SEAP) expression level for each sample (RPE = Virion RNA content/SEAP-normalized Cyt RNA expression). The significance of the results was expressed as p values using Student's t-test.
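The quantification described above can be illustrated with a short calculation. The following is a minimal sketch, assuming the standard 2^−ΔΔCT formulation with β-actin as the reference and assuming that SEAP normalization means dividing cytoplasmic expression by a relative SEAP level; the source states only the RPE ratio itself. All CT values, the virion RNA quantity, the SEAP level, and the function names are hypothetical.

```python
from statistics import mean


def relative_quantity(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change of a sample relative to a calibrator by 2^-ddCT.

    Each argument is a list of replicate CT values (e.g. triplicates).
    dCT  = mean CT(target) - mean CT(reference), for sample and calibrator
    ddCT = dCT(sample) - dCT(calibrator)
    """
    delta_ct_sample = mean(ct_target) - mean(ct_reference)
    delta_ct_calibrator = mean(ct_target_cal) - mean(ct_reference_cal)
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


def relative_packaging_efficiency(virion_rna, cytoplasmic_rna, seap_level):
    """RPE = virion RNA content / SEAP-normalized cytoplasmic RNA expression.

    Assumption: normalization divides cytoplasmic expression by the
    relative SEAP level of the same sample.
    """
    seap_normalized_cytoplasmic = cytoplasmic_rna / seap_level
    return virion_rna / seap_normalized_cytoplasmic


# Hypothetical triplicate CT values for cytoplasmic RNA.
sample_siv = [22.0, 22.5, 21.5]        # mean 22.0
sample_actin = [18.0, 18.0, 18.0]      # mean 18.0, so dCT = 4.0
calibrator_siv = [24.0, 24.0, 24.0]    # mean 24.0
calibrator_actin = [18.0, 18.0, 18.0]  # mean 18.0, so dCT = 6.0

# ddCT = 4.0 - 6.0 = -2.0, so the fold change is 2^2 = 4.0.
fold_change = relative_quantity(
    sample_siv, sample_actin, calibrator_siv, calibrator_actin
)
print(f"Cytoplasmic fold change: {fold_change:g}")

# Hypothetical relative virion RNA quantity of 2.0 and SEAP level of 1.0.
rpe = relative_packaging_efficiency(
    virion_rna=2.0, cytoplasmic_rna=fold_change, seap_level=1.0
)
print(f"RPE: {rpe:g}")
```

Averaging replicate CT values before taking differences is one common choice; the sketch does not reproduce the statistical testing described in the text.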

Data availability statement
Data included in article/supp. material/referenced in article.

Declaration of interest's statement
The authors declare no competing interests.